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ABSTRACT 


This  study  deals  with  the  problem  of  approximating  the  probability 
distribution  function  of  the  project  duration  in  probabilistic  activity 
networks.  It  describes  a  procedure  that  has  been  developed,  programmed 
and  tested,  using  activity  networks  of  real  life  projects  as  well  as 
randomly  generated  ones.  The  procedure  allows  the  activity  duration  to 
have  any  of  the  following  distributions:  Uniform,  Triangular,  Normal, 
Exponential,  Gamma,  Beta  or  any  discrete  distribution  presented  in  a 
finite  set  of  ordered  pairs.  The  computational  experience  indicates  that 
the  resultant  probability  distribution  function  (pdf)  is  very  close  to 
the  actual  pdf,  the  latter  is  obtained  through  extensive  Monte  Carlo 
sampling.  In  fact,  computational  experience  shows  that  the  pdf  obtained 
by  Monte  Carlo  sampling  converges  to  the  approximate  pdf  as  the  sample 
size  increases .  The  procedure  is  programmed  in  FORTRAN  and  the  CPU  time 
for  any  moderate  size  project  (i.e.,  G(N,A  ) ^  G(60,200))  is  less  than 
half  a  minute  on  AMDAHL  V-7.  <rt~  Z. 


I.  INTRODUCTION 


One  of  the  main  difficulties  in  probabilistic  activity  networks 
(PANs)  is  the  determination  of  the  probability  distribution  function 
(pdf)  of  the  project  completion  time.  Approximating  the  probability 
distribution  function  becomes  very  desirable  if  it  can  be  easily  per¬ 
formed  and  if  it  results  in  an  estimation  close  to  the  actual  pdf. 
Before  introducing  the  proposed  approximating  procedure,  which  has 
these  two  features,  the  following  definitions  and  symbols  will  be  used 
throughout  the  discussions  to  follow: 


G(N,A) : 

|A|: 

i: 

a: 

NS (a): 
NE(a) : 
IN(i): 
OUT ( i)  : 

S(-): 


PRE(i)  « 


*  : 


An  activity  network  with  N  nodes  and  A  arcs.  Nodes 
represent  events  and  arcs  represent  activities.  Node  i  is 
connected  directly  to  node  j,  where  i<  j ,  by  at  most  one  arc. 
Number  of  arcs  in  A. 


Denotes  a  node,  and  is  indexed  from  1  to  N. 


Denotes  an  arc,  and  is  indexed  from  1  to  |a|. 


Starting  node  of  activity  a. 
End  node  of  activity  a. 
Indegree  of  node  i. 


Outdegree  of  node  i. 


For  either  node  or  arc 


if  the  argument  variable  is  "inactive" 


1  If  the  argument  variable  is  "active" 
where  an  "active"  node  or  arc  is  one 
that  is  retained  in  the  final 
(irreducible)  network. 


{a  |  NE(a)  ■  i  and  5(a)  =1},  the  set  of  active  arcs  that 


precede  node  i. 


A  convolution  operator. 

Minimum  realization  of  an  activity. 


v:  Maximum  realization  of  an  activity. 

a:  First  parameter  for  the  Exponential,  Gamma  and  Beta  distri¬ 

butions. 


B: 

m: 
f  (x) : 
IML: 

MCS  = 

F(-): 
px<x)  = 


The  second  parameter  for  the  Beta  distribution. 

Mode  of  a  probability  distribution  function. 

Density  function  of  the  random  variable  X,  i.e.,  f(x)  =  dF(x) . 

A  list  of  nodes  in  the  AN  whose  pdf's  are  desired,  i.e., 
milestone  or  key  nodes. 

/*■ 

0  if  Monte  Carlo  sampling  is  not  desired 
1  otherwise 

The  Fast  Fourier  Transformation  of  the  argument. 

p(X=x),  the  probability  mass  of  the  random  variable  X  and 
in  short  is  denoted  by  p(x). 


NIN:  Number  of  intervals  in  the  sampled  distributions  (pdf's  ob¬ 

tained  by  Monte  Carlo  sampling) . 


A  node,  i,  is  realized  at  time  T^,  which  is  a  random  variable  whose 
pdf  is  denoted  by  F(t),  or  simply  F(i) .  On  the  other  hand,  an  activity 
is  denoted  by  a,  and  has  a  duration  X a  which  is  also  a  r.v.  whose  pdf  is 
denoted  by  F(x) ,  or  simply  F(a) .  Furthermore,  we  denote  CF(*)  the  count 
of  discrete  values  assumed  by  the  r.v..  Hence,  the  pdf  F(*)  may  be  repre¬ 
sented  by  a  set  of  ordered  pairs  {(x  ,  p(x  ))}  for  nodes  and  { (x  ,  p(x  ))} 

m  m  m  m 

for  arcs,  m  «  1,2,3, ... ,CF( •) •  When  we  wish  to  refer  to  either  arc  or 

node  we  write  { (R  ,  p(R  )}.  Let  NRR  denote  the  desired  number  of  ordered 
m  m  ' 


pairs  in  the  above  set  of  ordered  pairs. 

The  approximating  procedure  consists  of  the  following  five  consecutive 


1  -  Generating  Random  Activity  Networks  (GRAN) 

2  -  Discretizing  the  Continuous  Distributions  (DISCRT) 

3  -  Reducing  the  Network  to  Its  Irreducible  Form  (SCAN) 

4  -  Sequentially  Approximating  the  Irreducible  AN  (APRXMT) 

5  -  Testing  the  Accuracy  of  the  Approximate  pdf  (SIMULT)  and  (MAVGDV) 

A  brief  discussion  of  the  functions  of  each  step  is  given  below;  however, 
a  detailed  discussion  is  given  in  the  subsequent  sections.  Flowchart  1 
gives  the  outline  of  the  approximating  procedure  (the  driver  program). 

In  the  first  step,  if  G(N,A)  is  not  part  of  the  input  data,  then  it 
is  randomly  generated  from  the  space  of  all  feasible  AN's  with  the  speci¬ 
fied  N  and  |a|.  In  such  a  case  GRAN  (a  random  activity  network  generator) 
is  used  and  a  rule  for  assigning  a  pdf  to  each  arc  has  to  be  specified.  An 
activity  can  have  one  of  the  following  six  continuous  distributions: 

Uniform,  Triangular,  Normal,  Exponential,  Gamma,  and  Beta, 

or  any  discrete  distribution  presented  in  a  finite  set  of  ordered  pairs. 

If  any  a  e  A  has  a  continuous  pdf,  then  DISCRT  is  used  to  approximate 
such  a  distribution  by  a  discrete  one.  Different  discretization  methods 
are  presented  in  Section  III. 

The  first  step  in  estimating  the  pdf  of  the  project  completion  time 
is  to  reduce  the  AN  wherever  possible.  Two  kinds  of  reductions  can  occur: 

a.  Convoluting  two  activities  In  series,  which  gives  rise  to  a  new 
activity.  This  occurs  if  there  is  a  node  i  with 

IN(i)  -  OUT(i)  -  1, 

such  a  node  will  be  deactivated  after  the  convolution  operation, 
i.e.,  old  <5(i)  =  1  becomes  new  6(i)  -  0. 


b.  Taking  the  maximum  over  two  activities  having  the  same  starting 
and  ending  nodes  (in  parallel) . 

An  efficient  search  method  for  effecting  such  reduction  is  developed,  it 
is  denoted  by  "SCAN".  It  reduces  the  AN  to  its  irreducible  form  (IAN). 
Details  of  SCAN  are  given  in  Section  IV.  The  IAN  can  be  used  to  determine 
the  unique  activities;  where  an  activity  is  "unique"  if  it  is  an  element 
of  only  one  path. 

If  the  irreducible  network  has  more  than  one  arc,  then  sequential 
approximation,  which  is  the  subject  of  Section  V,  is  used  to  determine 
F(N)  as  well  as  the  pdf  of  every  active  node  in  the  IAN.  In  certain  cases 
other  active  nodes  in  the  IAN,  beside  node  N,  are  of  interest  to  the  ana¬ 
lyst.  The  program  "APRXMT"  prints  the  pdf  of  each  of  these  nodes  in  the 
form  of  a  table  as  well  as  a  digital  plot.  Along  with  the  pdf,  he  mean 
and  the  standard  deviation  are  also  given.  All  such  nodes  are  listed  in 
the  input  data  under  the  symbol  I ML.  If  all  the  nodes  of  the  AN  are  elements 
of  the  set  IML  then  the  sequential  approximation  is  applied  to  the  AN  without 
the  use  of  SCAN. 

From  this  introduction  we  realize  that  the  error  in  the  approximation 
occurs  due  to  three  causes: 

1  -  Discretization:  The  error  in  approximating  the  continuous  pdf  by 
a  discrete  one  can  be  kept  at  any  desired  level  by  simply  choosing  the 
proper  spacing,  d,  in  DISCRT. 

2  -  Independence  of  the  Nodes:  In  the  sequential  approximation  we 
assume  that  the  nodes  are  independent,  where,  in  fact,  some  nodes  may  be 
dependent.  If  the  pdf  of  each  arc  in  the  AN  is  approximated  by  a  normal 
distribution,  and  we  are  interested  in  characterizing  the  pdf  of  the  project 

completion  time  by  its  first  four  moments,  then  we  can  use  the  results 
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developed  by  C.  Clark  [1]  .  However,  in  this  report  we  are  interested  in 
approximating  the  total  pdf  of  the  project  completion  time,  not  only 
some  of  its  moments,  when  the  pdf  of  each  activity  may  not  be  normal. 

This  is  why  we  chose  not  to  follow  Clark's  approach,  though  we  recognize 
its  usefulness  in  the  special  circumstances  to  which  it  is  applicable. 

3  -  Reducing  the  Dimension  of  F(i) :  In  the  sequential  approximation 
if  CF(i)  >  NRR,  then  F(i)  is  approximated  by  F*(i)  which  has  only  NRR 
ordered  pairs.  The  error  caused  by  this  approximation  can  be  controlled 
by  choosing  large  NRR.  However,  for  practical  purposes  NRR  can  be  chosen 
to  be  between  20  and  30;  otherwise,  the  program  may  run  into  storage 
problems,  especially  if  N  and  | A |  are  large.  If  NRR  is  not  binding,  then 
all  sets  F(i)  represent  the  exact  pdf's  if  all  arc  pdf's  were  discrete 
from  the  outset;  otherwise,  the  error  in  F(i)  is  limited  to  what  is  caused 
by  the  first  two  factors. 

To  be  able  to  measure  the  error  in  approximating  F(N) ,  the  actual 
F(N)  is  obtained  by  extensive  Monte  Carlo  sampling  of  the  original  AN 
using  the  actual  distribution  functions.  Then  the  maximum  absolute  devia¬ 
tions  and  the  average  absolute  deviations  between  the  two  distributions  are 
determined.  Also,  the  actual  average  and  standard  deviation  of  the  original 
pdf  are  compared  with  the  approximate  mean  and  standard  deviation.  This  is 
done  in  Section  VI  through  the  use  of  SIMULT  and  MAVGDV . 

The  computational  experience  presented  in  Section  Section  VII  indicates 
that  the  approximate  pdf  is  very  close  to  that  obtained  by  extensive  MCS. 

In  fact,  this  experience  shows  that  as  the  sample  size  in  MCS  increases  the 
above  four  measures  approach  the  corresponding  values  obtained  by  the 
approximating  procedure. 


The  procedure  has  been  programmed  using  FORTRAN  IV.  It  is  easy  to 
operate  and  can  be  used  for  any  size  network  after  making  the  necessary 
storage  adjustment.  Appendix  C  describes  the  input  requirements.  The 
program  was  tested  for  ANs  of  sizes  (N,A)  £  (60,200).  The  CPU  time  in 
all  cases  was  less  than  thirty  seconds  on  AMDAHL  V-7  excluding  the  MCS 
time.  The  program  listing  may  be  obtained  from: 

Graduate  Program  in  Operations  Research 
N.C.  State  University 
P.0.  Box  5511 
Raleigh,  N.C.  27650 
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II.  GENERATING  RANDOM  ACTIVITY  NETWORKS 


The  activity  network  G(N,A)  is  either  given  a  priori,  which  can  be  an 
acyclic  network  of  a  real  life  project,  or  is  specified  by  only  N  and  |a| . 

In  the  latter  case  it  is  desired  to  generate  a  feasible  acyclic  network 
and  to  assign  a  pdf  to  each  activity. 

Any  procedure  used  to  generate  G(N,A)  must  guarantee  that  such  a  net¬ 
work  has  equal  probability  to  be  chosen  from  the  set  of  all  feasible  AN's 
with  the  specified  number  of  N  nodes  and  [ A [  arcs.  Either  of  the  following 
two  procedures,  due  to  Herroelen  [5],  can  be  used  to  guarantee  the  complete 
randomization  of  G(N,A). 

1  -  Deletion  Method:  It  starts  with  a  completely  connected  acyclic 
network,  i.e.,  the  upper  triangle  of  the  adjacency  matrix  is  filled  with 
ones.  Hence  we  start  the  process  with  N(N-l)/2  arcs  in  hand,  and  it  is 
desired  to  delete 

K  =  N(N-l) /2  -  { Af 

arcs  subject  to  the  constraints 

i)  The  generated  network  is  feasible,  i.e., 

IN(i)  >  1  for  all  i  ^  1 
OUT(i)  ^  1  for  a11  i  +  N 

ii)  The  generated  network  is  completely  randomized,  i.e.,  all  net¬ 
works  possessing  this  count  of  N  and  |a]  are  equally  probable. 

The  Deletion  Method  does  just  that.  For  the  sake  of  completeness,  the  theory 
is  presented  in  Appendix  A.  The  method  proceeds  as  follows: 


* yjart  f*r  ******  *f-Y'  • 


a.  Let  OUT(i)  =  N  -  i  for  all  i  *  N 
IN(i)  =  i  -  1  for  all  i  +  1 


b. 


c. 


d. 


e. 


f. 


and  set  L  -  0 

Generate  a  random  number,  denoted  by  r^,  where  r^  ~  U(0,1)  then 
let  j  =  }_N  +  1/2  -  /N(N-l)r1  +  1/4_J  (1) 

If  OUT(j)  =  1  or  IN(i)  =  1  for  all  i  >  j  go  to  b;  otherwise  con¬ 
tinue. 

Generate  another  random  number,  denoted  by  where  ~  U(0,1) 
and  let  k  =  J  j  +  1  +  ^(N-j)  | 

If  IN(k)  «  1  go  to  d;  otherwise  arc  a  =  (j,k)  is  deleted,  l.e., 

6(a)  =  0.  Update  IN(k)  and  OUT(j)  and  put  L  =  L  +  1. 

If  L  <  K  go  to  b;  otherwise  a  completely  randomized  G(N,A)  is  in 
hand  and  the  process  stops. 


2  -  Addition  Method:  This  is  the  reverse  process;  it  starts  with  the 
adjacency  matrix  filled  with  zeroes  except  for 

6(1,2)  =  6 (N-l,  N)  =  1, 

which  guarantees  one  start  and  one  terminal  node.  The  Addition  Method  then 
proceeds  to  generate  the  remaining  ) A  j  -  2  arcs  subject  to  constraints  (i) 
and  (ii)  above.  Unfortunately,  the  Addition  Method  as  developed  by 
Herroelen  [5]  may  generate  more  than  j  a}  —  2  arcs;  it  may  generate  an  extra 
M  arcs  where  0  5  M  <  N-3,  especially  if  { A |  <  2N-4.  In  Appendix  A,  where  the 
theory  of  the  addition  method  is  presented,  we  dwell  more  on  this  problem. 

The  Addition  Method  deals  with  two  sets  of  arcs:  the  first  set  has  the 
"feasibility  arcs"  which  range  from  (N-3)  to  (2N-6)  arcs,  and  the  second  set 
has  the  remainder  of  the  arcs  (if  it  is  nonempty) .  The  second  set  will  be 
denoted  by  "free  arcs".  The  Addition  Method  generates  as  many  as  possible  of 


the  free  arcs  in  a  random  fashion  which  (as  shown  in  Appendix  A)  may  increase 
as  more  nodes  become  feasible.  The  procedure  keeps  track  of  the  infeasible 
nodes,  i.e.,  nodes  with  zero  indegree  or  zero  outdegree  (excluding,  of  course, 
nodes  1  and  N) .  If  the  number  of  free  arcs  is  reduced  to  zero,  then  the 
procedure  generates  the  arcs  necessary  for  feasibility;  this  step  may  cause 
the  generation  of  more  than  | A |  arcs.  This  problem  is  solved  by  using  the 
Addition  Method  to  delete  the  M  extra  arcs  at  random  without  violating  the 
feasibility  of  the  AN.  The  Addition  Method  is  summarized  in  Flowchart  2. 

Obviously,  either  method  can  be  used  to  generate  G(N,A) ;  however,  if 
the  network  is  dense,  then  the  Deletion  Method  may  be  preferred  since  less 
arcs  are  to  be  deleted  than  added.  The  Deletion  Method  is  used  if 

J A)  >  N(N-l) / A  , 

otherwise  the  Addition  Method  is  used.  Unfortunately,  the  Addition  Method 
is  not  as  efficient  as  the  Deletion  Method,  and,  in  fact,  is  harder  to 
program.  Tests  of  both  procedures  for  large  N  and  j  A |  proved  the  validity 
of  the  above  rule.  Both  methods  are  used  in  the  approximating  procedure  and 
access  to  each  method  is  possible  by  the  automatic  use  of  the  above  rule. 


11 


Start 


Read  N  and  A 


Set:  K  =  I A  j ,  m  =  n  =  N  -  3 
6(a)  =  0  for  all  a  i  1,K 
6(1)  =  6(K)  =  1  and  L  = 
f  =  K  -  L  -  m  ■ 


f  <  0? 


Locate  a  node  j 
with  IN(j *)  =  0 

*  * 

generate  a  node  i  <  j 


Set:  L  =  L  +  1  and 
6(a)  =  1 

Update  m,  n  and  f. 


m:  number  of  non-receiving  nodes,  i_.£. ,  nodes  with  zero  indegree, 
n:  number  of  non-emitting  nodes,  1^. e. ,  nodes  with  zero  outdegree. 


Flowchart  2 


The  Addition  Method 


III.  DISCRETIZING  CONTINUOUS  DISTRIBUTIONS 


Determining  or  approximating  the  pdf  of  the  project  duration  depends 
on  the  pdf  of  each  activity  in  G(N,A).  For  example,  in  Figure  1,  the  r.v. 


T^  is  a  function  of  all  the  activities,  as  shown  in  Equation  (2). 

T^  =  Max{X1  +  X^,  X1  +  X3  +  X5,  X2  +  X$} 

Determining  F(t)  may  not  be  a  simple  matter  especially  if  some  (or  all) 
of  the  activities  have  different  continuous  distributions.  Other  net¬ 
works  may  be  more  complicated.  The  determination  of  F(t)  is  made  easier 
if  activities  1  through  5  have  discrete  distributions;  in  such  a  case  the 
digital  computer  can  be  used  to  determine,  or  approximate,  F(t) 

The  first  step  in  the  approximating  procedure  is  to  discretize  all 
the  continuous  distributions  in  the  AN.  This  is  done  by  determining  a 
set  of  ordered  pairs  denoting  F(a).  The  cardinality  of  CF(a)  depends  on 
the  desired  accuracy  of  the  discretization.  Using  the  closeness  to  the 
exact  values  of  the  first  five  moments  as  a  criterion  to  determine  the  count 
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of  points  in  the  pdf  F(a) ,  denoted  by  CF(a);  three  methods  have  been  pro¬ 
posed  and  tried.  The  most  efficient  in  terms  of  accuracy  and  computer 
time  is  a  hybrid  of  methods  2  and  3  described  below.  These  three  methods 
are: 

1  -  The  2m  Method:  If  we  decide  that  CF(a)  =  m  for  any  activity  a 
with  continuous  distribution,  then  from  the  definition  of  F(a)  we  have 
2m  unknowns:  m  realizations  and  the  corresponding  m  probabilities.  The 
first  2m  moments  of  the  continuous  distribution  can  be  used  to  construct 
the  following  system  of  2m  nonlinear  equations: 

m 

l  *£p(xk)  -  en,  for  n  =  0,1,2, ... ,2m-l 
k=l 

where 

e  *  E(Xn)  =  xndF(x)dx,  the  n^*1  moment. 

11  J_oo 

In  a  matrix  form  we  have: 

VP  *=  E 

where  V  is  the  Vandermonde  matrix  of  dimension  2raxm  and  P  is  the  prob¬ 
ability  vector  with  m  components,  and  E  is  the  vector  of  the  2m  moments. 

Two  methods  have  been  tried  to  solve  this  system  of  nonlinear  equations, 
but  unfortunately  neither  succeeded  for  m  >  8.  These  methods  are  pre¬ 
sented  in  Appendix  B  for  completeness. 

2  -  Using  Equal  Distances:  Based  on  the  distribution  of  the  activity, 
the  minimum  and  maximum  realization  values  u  and  v  can  be  determined;  then 
by  the  use  of  an  appropriate  spacing  &,  depending  on  the  desired  accuracy. 
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the  range  (v-u)  can  be  subdivided  into  equal  intervals.  In  this  case 

x^  ■  u  +  A (k-1)  V  k  =  1 , 2 , 3 , . . . ,m  where  m  * 

As  a  rule  in  this  study  the  minimum  and  maximum  realizations  are  deter¬ 
mined  such  that 

p(X<u)  *  p(X>v)  =  0.0005. 


The  corresponding  probabilities  are  determined  according  to 
xk+A/2 

p(x.)  =  dF(x)dx  for  each  k=  2 ,3 ,4, . . . ,m-l, 

V‘/2 

rU+A/2  r°° 

and  p(x..)  =  p(u)  *  dF(x)dx  and  p(x  )  =  p(v)  =  dF(x)dx. 

1  i-co  m  v-A/2 

For  a  small  A  (large  m)  the  determination  of  the  probability  can  be 
approximated  by  p(y^)  *  A  f(y^)  for  each  k  =  l,2,...,m  where  y^  is  the 
center  of  the  k  interval,  i.e.,  yk  =  u  +  A(k-l)  +  A/2,  and  f(y)  *  dF(y). 
This  approach,  as  it  appears  in  Figure  2,  treats  all  points  in  the  range 
of  the  r.v.  in  a  uniform  fashion,  i.e.,  we  partition  the  range  into  equal 
distances.  This  makes  the  discretization  suitable  for  the  use  of  the 
Fast  Fourier  Transformation  (FFT)  method  in  the  successive  approximation 
discussed  in  Section  V.  It  is  also  very  convenient  for  some  distributions 
such  as  the  uniform,  and  the  triangular  distributions,  and  some  other  dis¬ 
tributions  when  their  skewness  or  peaks  are  not  very  acute.  If  sharp 
peaks  are  present,  such  as  the  case  in  the  exponential  distribution  with 
large  parameter  «,  or  the  normal  distribution  with  small  o,  then  very 
small  values  of  A  are  used  to  minimize  the  errors  of  approximation.  This 
drawback  led  to  the  use  of  the  following  alternative. 
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3  -  Using  Equal  Probabilities:  Here,  again,  u  and  v  are  determined  in 
the  same  way  as  in  Method  2.  Then  F(a)  is  determined  according  to: 

A  •  p(x^)  “  1/m  f°r  a  8*ven  ®> 

and 

i  k 

»  F  (  l  P(x  )  -  d/2) 

j  =1  J 

using  the  continuous  distribution  function.  This  scheme  is  suitable  for 
all  distributions  under  consideration.  However,  it  may  not  be  easy  (or 
it  can  be  time  consuming)  to  invert  some  of  the  distribution  functions. 
Hence  its  use  is  limited  to  the  exponential  distribution,  where  it  is 
needed  the  most,  while  the  method  of  equal  distances  is  used  for  the  re¬ 
maining  five  distributions.  Figures  2  and  3  illustrate  methods  2  and  3 
respectively  for  the  exponential  distribution  with  parameter  «  =  1 . 

Figure  3  shows  that  Method  3  responds  to  the  peak  of  the  pdf  by  taking 
more  realizations,  where  Figure  2  shows  that  Method  2  does  not  respond 
to  peaks. 


Probability  Discretization  for  the  Exponential  Distribution 


IV.  REDUCING  THE  NETWORK  TO  ITS  IRREDUCIBLE  FORM 


In  an  AN  it  is  always  possible  to  combine  two  arcs  in  series  to  form 
a  new  arc.  Such  an  operation  is  accomplished  through  "convolution".  Each 
convolution  operation  reduces  N  and  | A |  each  by  one;  actually  the  network 
gains  a  new  arc,  but  loses  two  arcs;  this  gain  and  loss  are  represented 
by  the  node  and  arc  indicators.  For  example,  in  Figure  4  arc  a^  =  (i^,  i  ) 
is  convoluted  with  arc  a, 2  =  (±2,  i^)  to  give  arc  a^  =  (i^,  i3) . 


/T  \  a  =  a,  *  a 
)  ~3  "I 


Figure  4 

Convolution  Operation 


This  operation  introduces  the  following  changes: 


<5  (i^)  =  0  and  ^(a.^)  =  0 
6<a2)  =  0 
5  (a^)  =*  1 
x3  -  x,*  x2 


and  the  pdf  of  a,  is  defined  by  the  set  F(a_)  =  {(y>  p(y))},  where 
-J  y  -J 

PY  (y)  ■  p(X,  -  y)  -  l  Py  (x)p  (y-x). 

*3  J  x-0  1  2 
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If  the  nodes  and  in  Figure  4  were  originally  connected  with 
an  arc,  say  a^,  then  the  two  arcs  a^  and  a^  can  be  "reduced"  to  one  arc  £<., 
with  duration  •  Now 

X5  =  Max{X3,  X4> 

and  the  pdf  of  X^  is  defined  by  the  set  F(a^) =  {(z,  p(z))},where 
F,-(z)  «  F3(z)*F4(z),  and  p,.(2)  *  F^(z)  -  F^(z  )  for  a  z  slightly  less 
than  z.  Figure  5  illustrates  this  process.  Such  an  operation  is  called 
"maximum"  operation. 


Figure  5 

Convolution  and  Maximum  Operations 
The  "maximum"  operation  reduces  the  AN  by  one  arc;  it  sets 

<S(I3)  -  0 

6(a4)  -  0 

and  S(§3)  ■  1 
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The  reduction  process  starts  with  a  convolution  operation,  then  a 
sequence  of  maximum  and  convolution  operations  may  follow.  The  process 
may  start  with  any  of  the  initial  convolutions  without  fear  of  not  reaching 
the  irreducible  form  of  the  AN.  Both  operations  (convolution  and  maximum) 
maintain  the  topological  sorting  of  the  AN. 

The  search  for  the  convolution  and  maximum  operations  and  the  necessary 
bookkeeping  which  goes  with  each  operation  is  carried  out  by  an  algorithm 
called  "SCAN",  which  is  based  on  the  following  two  evident  observations: 

(i)  For  any  node  i,  if  IN(i)  =  OUT(i)  =  1  then  the  arcs  entering  i 
and  emanating  from  i  form  a  convolution  operation. 

In  some  projects  node  i  can  be  considered  a  milestone  event,  i.e.,  i  e  IML, 
and  in  such  a  case  8 (i)  *  1  throughout  the  approximating  process  and  the 
convolution  operation  may  not  be  carried  out. 

(ii)  For  any  two  arcs  a  4  a/  if 

NS (a)  =  NS (a’) 
and  NE (a)  =  NE(a') 

then  a  and  a*  form  a  maximum  operation. 

At  the  initial  step  of  the  reduction  process  there  does  not  exist  any  maximum 
operation  since  there  is  at  most  one  arc  connecting  any  two  nodes  i  and  j 
directly,  where  i  <  j.  The  condition  for  the  maximum  operation  develops 
after  the  occurrence  of  at  least  one  convolution  operation;  hence,  we 
initially  check  for  a  convolution  operation  using  the  first  observation 
and  the  following  result  which  can  be  proved  by  induction: 

Assertion  1:  In  any  AN  if  IN(i)  +  OOT(i)  >  2  for  all  nodes  i  4  1  or  N, 
then  the  AN  is  irreducible. 

Therefore,  if  IN(i)  +  OUT(i)  >  2  for  all  i  j4  1  or  N,  then  the  process  moves 
on  to  the  sequential  approximation.  In  this  case  the  calculations  are  limited 


to  only  N-2  additions  and  comparisons.  However,  if  there  is  a  node  i  ^  1 
or  N  such  that 

IN(i)  +  OUT ( i)  =  2, 

then  a  convolution  operation  is  possible  which,  if  carried  out,  might  give 
rise  to  maximum  and  convolution  operations;  then  the  search  for  both 
operations  becomes  necessary.  The  search  procedure  based  on  observations 
(i)  and  (ii)  above  is  presented  in  Flowchart  3.  This  discussion  leads  to 
the  following  result: 

Assertion  2:  For  any  node  i  ^  1  or  N  in  an  irreducible  activity  network 

IN(i)  +  OUT(i)  _>  3  . 

SCAN  was  programmed  and  tested  using  networks  generated  by  GRAN  as  well 
as  networks  of  actual  projects. 

If  the  network  is  reducible  then 

SCAN:  G(N, A)  ->  G(N'  ,A') 

where 


2  £  N'  <  N 

and 

1  <  |A'|  <  |A| 

If  N'  =  2  then  j A ' [  =  1  and  the  network  is  said  to  be  "completely  reducible". 
Then  the  approximating  procedure  terminates  with  the  pdf  of  without  resort 
to  the  sequential  approximation  process  presented  in  Section  V.  This  is 
the  ideal  situation  where  no  approximation  is  made  except  in  the  discre¬ 
tization  stage.  Such  is  the  case  ..n  the  AN  of  Figure  6  where  pdf  of  T 


reached  after  3  convolutions  and  2  maximum  operations. 


Calculate  IN(i)-,  ,  , ,  . 

OUT(i)}  f°r  311  1 
and  set  Ml™ 1 A 1 


/Does  3  an^ 
iIN(i)=OUT(i)=J 
id  1  4  IML 


The  network  is 
irreducible 


A  convolution  operation 
is  conducted  and  all 
necessary  bookkeeping 
is  done.  Set  M1=M1+1 


(k)»l 


Printout  G(N’,A’) 
(the  irreducible 
AN)  I 


k  >  Ml 


Condition  (ij 


Condition  (iy 


A  maximum  operation  is  con¬ 
ducted  and  all  bookkeeping 
is  done.  Set  Ml  ■  Ml  +  1 


Flowchart  3 

An  Algorithm  for  Determining  the 
Irreducible  Network 


r 
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V.  SEQUENTIAL  APPROXIMATION  OF  IRREDUCIBLE  ANS 

The  procedure  to  be  described  may  be  used  for  any  AN  with  N  >  2  and 
|a|  >  1,  reducible  or  irreducible.  However,  as  it  is  shown  in  Flowchart  I, 
we  enter  this  step  of  the  approximating  procedure  only  with  irreducible 
networks;  such  a  network  was  the  result  of  the  SCAN  operation,  and  is 
denoted  by  G(N',  A').  It  is  represented  in  the  memory  of  the  processor 
by  the  active  nodes  and  arcs,  i.e.,  only  nodes  and  arcs  with  6's  equal  to  1. 
This  designation  allows  us  to  maintain  the  original  structure  of  the  AN. 

Two  of  the  active  nodes  are:  the  starting  node,  1,  where  it  has  the 
realization  time  0  with  probability  1,  i.e.,  F(l)  =  {(0,1)},  and  the 
terminal  node,  N,  representing  the  project  completion  event.  The  F(N) 
is  the  one  we  are  after.  Starting  at  node  1  we  proceed  to  approximate 
the  pdf  of  every  active  node,  in  increasing  order,  ending  with  node  N. 

The  name  "sequential  approximation"  is  derived  from  this  step.  The  pdf 
of  each  active  node  is  approximated  by  the  following  procedure,  which  is 
illustrated  in  Flowchart  4. 

Without  loss  of  generality  assume  the  process  is  at  node  i;  node  i 
is  active,  hence 

IN(i)  >  1  or  OUT(i)  >  1,  IN(i)  +  OUT(i)  >  3,  or  ieiML, 
see  Figure  7  for  illustration,  then: 


L 
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1  -  Determine  the  set  of  active  arcs  terminating  in  node  i,  i.e., 

PRE(i)  *  {a  |  6(a)  =  1  and  NE(a)  *  i}. 

2  -  For  each  a  €  PRE(i) 

(a)  Convolute  F(a)  with  F(NS(a)).  Denote  this  convolution 

by  F(i  ) . 
a 

(b)  If  k  -  CF(ia)  >  NRR  then  use  the  operator: 

APPR:  F(i  )  -*•  F'(i  ) 
a  a 

where  CF'(i  )  *  NRR,  i.c. ,  approximate  the  k  ordered  pairs 
a 

by  only  NRR  ordered  pairs.  This  is  done  according  to  the 
rules: 

(i)  The  full  range  of  the  realization  of  the  project  end¬ 
ing  at  node  i  is  maintained. 


(ii)  The  intermediate  k-2  points  are  mapped  into  NRR-2  points 
using  the  following  three  steps: 
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(1)  Let  A  =  (R^_^  ”  R2)/(NRR-2)  »  then  we  have  NRR-2 

th 

intervals,  each  is  of  width  A.  The  n  interval 

must  contain  all  realizations  R  €  (R  +  A(n-l), 

m  1 

R1  +  nA] . 

(2)  For  the  realizations  in  the  interval  let 

x_  =  I  R  x  p(r  )  and  y„  =  l  p(r  )• 

n  “  m  m  n  **  m 

m  m 

then 

(R\  P(R'))  =  (x  /y  ,  y  ) . 
n  n  n  n  n 

(3)  If  the  n1"*1  interval  is  empty,  then 

(R‘  P(R'))  =  (R  +  A(n-0.5),  0). 

n  n  l 

3  -  F(i)  =  Max{F ' ( i  )};  notice  that  if  APPR  is  not  used  then  F'(-)  =  F(-). 

j  -ii 

This  function  is  separable,  hence,  to  avoid  any  unexpected  escalation  in  the 
storage  requirements,  the  maximum  can  be  performed  sequentially,  then  the 
operator  APPR  in  step  (b)  above  can  be  used  whenever  it  is  necessary. 

For  instance,  we  let 

F(ij,  i2)  =  MaxfF'Uj),  F’(i2>} 

then  we  determine  Max{F’(i3),  FCi^,  i 2> }  and  so  on.  Hence  the  process  ter¬ 
minates  with  F(i)  where  CF(i)  <  NRR.  The  process  also  determines  the  mean 
and  the  standard  deviation,  and  if  i  £  IML,  then  F(i)  is  printed  out  in  the 
form  of  a  table  and  digital  plot.  The  process  moves  on  to  active  node  i' 
immediately  succeeding  node  i.  If  more  than  one  node  succeed  node  i,  then 
the  process  starts  with  the  smallest  numbered  node. 


The  maximum  operation  used  in  this  step  and  in  SCAN  is  performed  in 
the  usual  manner.  Perhaps  it  is  best  explained  by  the  example  with  the 


data  in  Table  1. 


1 

2 

3  (Max.  of  1&2) 

a 

p(xi> 

x2 

p(x2) 

x3 

_ 

P  (X3) 

■ 

1/3 

B 

1/8 

2/24 

B 

1/3 

3/8 

H 

6/24 

B 

1/3 

B 

3/8 

H 

13/24 

B 

B 

1/8 

■  . 

■ 

3/24 

Table  1 

pdf  of  Two  Arcs  or  an  Arc  and  a  Node 


F(3)  -  Max{F(l) ,  F(2)} 

=  {(2,  1/24),  (3,  3/24),  (4,  3/24),  (5,  1/24),  (2,  1/24),  (3,  3/24), 

(4,  3/24),  (5,  1/24),  (4,  1/24),  (4,  3/24),  (4,  3/24),  (5,  1/24)} 

-  ((2,  2/24),  (3,  6/24),  (4,  13/24),  (5,  3/24)} 

2 

The  complexity  of  the  maximum  operation  is  of  O(NRR)  . 

The  convolution  operation  can  be  performed  by  either  the  usual  formula 
or  through  the  use  of  Fast  Fourier  Transformation  (FFT) .  Testing  both 
methods  for  different  AN's  resulted  in  the  preference  of  the  usual  formula 
of  convolution.  This  is  due  to  the  following  factors: 

(i)  FFT  requires  integer  subscripts,  since  for  a  vector 

c  -  <co,  c1,...,cn_1)  6  En 

n-l  , 

F(c  )  ■  £  c,  exp(2iirjk/n) ,  V  j  ■  0,1,2, ...  ,n-l  where  i  *  -1. 

J  k-0  * 
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Hence,  the  realization  of  all  activities  has  to  be  transferred  into  multiples 
of  the  greatest  common  divisor;  which  is  a  burdensome  requirement  by  itself 
and  poses  storage  difficulties,  compounded  with  matching  increases  in  the 
numbers  of  calculations.  Therefore,  before  FFT  is  used  to  convolute 

(c,  p(c))  and  (d,  p(d)), 

each  of  the  vectors  c^  and  d^  has  to  be  stretched  over  the  range 

(0,1,2,  ....  2K)  where  K  =  p-og^c^  +  dn_1)~|  . 

Then,  the  corresponding  probabilities  have  to  be  assigned  accordingly.  The 
example  given  below  illustrates  such  a  preparatory  step. 

(ii)  The  use  of  FFT  to  convolute  c  and  cl  above,  after  being  prepared, 
proceeds  as  follows: 

(a)  Determine  F(p(^))  and  F(p(d))  using  the  definition  of 
F( • )  in  (i)  above. 

(b)  Let  F(p(«0)  be  the  pairwise  product  of  F(p(c))  and  F(p(d)). 

(c)  Invert  using  F-* (F(p(«0  )  . 

Obviously,  the  three  steps  involve  many  transformations  and  the  complexity  of 

K  K 

the  procedure,  at  best,  is  of  the  0(2  ln2  )  =  O(nlogn),  since  we  defined 
K  =  log2n  ,  see  [  7]  . 

(iii)  After  p(e)  is  obtained,  a  reduction  transformation  may  be  necessary 

since  the  set  {(e^,  p(ei))}  is  of  length  2  and  many  of  the  probabili¬ 
ties  p(ei)  *  0,  especially  at  the  start  and  end  tails  of  the  set. 

Also,  a  common  factor  may  be  subtracted  from  the  realizations 
so  that  they  may  start  at  zero,  i.e.,  use  a  shifting  operator. 

The  following  example,  which  uses  the  data  of  and  X2  in  Table  1,  illus¬ 

trates  the  use  of  both  methods;  usual  convolution  (definitional  form) 
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and  the  preparatory  step  of  FFT.  Step  (ii)  above  was  processed  on  the 
computer  and  resulted  in  a  pdf  close  to  the  exact  pdf  presented  in  Table 
2.  The  deviation  is  due  to  the  truncations  and  transformations  used  in 
the  FFT  procedure  (step  (ii)  above). 

Using  the  definition  of  the  c-  nvolution  operator  to  X^  and  of 
Table  1  gives 


Y  =  X  *  X2  where 


PY  (y)  =  l  Px  (z)px  (y-z) 

z=0  i.  ‘2 


This  operation  is  summarized  in  Table  2  below 


y 

p(y) 

r -  "  •"  "  1  * — — 

P(y) 

3 

1/24 

1/24  =  .0417 

4 

3/24  +  1/24 

5/24  =  .2083 

5 

3/24  +  3/24 

11/24  =  .4583 

6 

1/24  4-  3/24  +  1/24 

16/24  =  .6667 

7 

1/24  +  3/24 

20/24  =  .8333 

8 

3/24 

23/24  =  .9583 

9 

1/24 

24/24  =  1.000 
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and  the  probability  vectors  are  assigned  accordingly.  Therefore: 


n= 

0 

1 

2 

3 

4 

5 

6 

7 

8 

9 

10 

11  . . . 

16 

p(x1)= 

0 

1/3 

1/3 

0 

1/3 

0 

0 

0 

0 

0 

0 

0  ... 

0 

p(x2)  = 

0 

0 

1/8 

3/8 

3/8 

1/8 

0 

0 

0 

0 

0 

0  ... 

0 

II 

>r* 

Ou 

0 

0 

0 

.0416 

.1665 

.2497 

.2081 

.1665 

.1249 

.0416 

0 

0  . . . 

0 

Table  3 

Convolution  Using  FFT  Method 


The  value  of  n  can  be  reduced  to  8  by  subtracting 

Min{ Y }  =  Min{X1)  +  Min{X2) 

=  1+2  =  3 

Hence  Table  3  can  be  transferred  to  Table  4  after  dropping  all  entries 
with  zero  probabilities  at  each  end  of  the  Table. 


Table  4 

The  Reduction  of  Table  3 
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VI.  TESTING  THE  ACCURACY  OF  THE  APPROXIMATE  PDF 

The  accuracy  of  the  approximate  pdf  of  the  project  duraticn,  repre¬ 
sented  by  F(N)  can  be  measured  by  its  closeness  to  the  "true"  pdf,  de¬ 
noted  by  F' (N) .  Such  "closeness"  is  measured  either  by  the  maximum  value 
of  the  absolute  deviation  of  F(N)  from  F'(N),  denoted  by  MDV,  or  the 
average  value  of  the  absolute  deviations,  denoted  by  ADV. 

This  Section  deals  with  the  problem  of  determining  such  maximum 
and  average  deviations.  It  first  deals  with  determining  F'(N).  This 
is  the  subject  of  the  first  subsection,  where  Monte  Carlo  samp] ing  is 
used,  since  obtaining  F' (N)  analytically  is  not  feasible  in  the  majority 
of  cases.  Then,  in  subsection  2,  linear  interpolation  is  used  to 
determine  the  maximum  and  average  absolute  deviations  (MDV  and  ADV). 

This  segment  of  the  Approximating  Procedure  is  used  only  as  an  evalua¬ 
tion  tool.  Access  to  this  test  is  possible  by  setting  the  parameter  MCS  =  1 
in  the  input  data.  Sampling  consumes  a  lot  of  CPU  time,  hence  in  dealing 
with  large  AN's,  allocation  of  time  should  be  considered  before  setting 
MCS  =  1.  The  following  is  a  discussion  of  the  sampling  model  called 
"SIMULT". 

1  -  Sampling  the  Activity  Network:  Monte  Carlo  sampling  of  the  AN 
is  performed  using  subroutine  SIMULT.  It  assigns  a  random  number  to  each 
arc  of  the  original  network,  G(N,A),  generated  from  the  original  pdf  of 
the  activity,  continuous  or  discrete.  Then  SIMULT  determines  the  comple¬ 
tion  time  of  the  project  using  the  longest  path  method;  this  also  results 


in  the  realization  times  of  all  nodes  in  the  AN.  These  two  steps  are 


repeated  for  a  "satisfactory"  number  of  times.  Here  the  number  of  samples 


should  be  "large"  to  guarantee  that  F' (N)  is  very  close  to  the  "true" 
pdf  of  the  project  duration  time.  For  example.  Table  5  shows  for  G(10,15) 
the  improvement  in  MDV  and  ADV  as  the  number  of  samples  increases. 

Hence  extensive  Monte  Carlo  sampling  is  necessary;  otherwise  the  values 
of  MDV  and  ADV  may  have  to  be  adjusted  to  reflect  the  error  in  the  sampled 
distribution  F* (N) .  The  trend  of  improvement  in  the  values  of  MDV  and  ADV 
in  Table  5  indicates  that  F(N)  may  be  closer  to  the  "true"  pdf  than  these  two 
measures  indicate.  _ 


Problem 

Dists. 

Type 

No.  of 
MCSs 

1 

All 

2 

All 

300 

3 

All 

500 

4 

All 

750 

5 _ 

All 

!  Comparison  of  the  Approximate  pdf  with  that  of  MCSj 
1  « _ 1  c  I 


Average 

S.  Deviation 

MDV 

ADV 

APRX. 

MCS 

APRX. 

MCS 

28.46 

27.74 

3.808 

4.434 

0.0556 

0.0174 

28.46 

27.83 

3.808 

3.772 

0.0568 

0.0158 

28.46 

27.80 

3.808 

3.690 

0.0584 

0.0159 

28.46 

3.808 

3.904 

0.0338 

28.46 

28.04 

3.808 

0.0274 

Table  5 


Effects  of  Number  of  Samples 
on  the  Measures  of  Performance 


The  gathered  information  through  sampling  is  tabulated  for  each  of 
the  critical  nodes  (elements  of  IML)  or  only  for  node  N.  Each  table  has 
three  columns;  these  are:  completion  times  ,  corresponding  probabili¬ 
ties  p(R^)  and  the  accumulative  probabilities  P(R^).  The  first  two 
columns  represent  the  ordered  pairs  of  the  pdf  F(N).  Table  6  was  gene¬ 
rated  using  SIMULT, while  Table  7  is  the  corresponding  table  using  the 
approximating  procedure  for  the  same  AN, which  is  0(10, lb)  with  all  the 
pdf's  under  cons iderat ion  are  used. 


The  SIMULT  subroutine  has  the  potential  of  simulating  any  AN  provided 


that  each  of  its  arcs  has  one  of  the  following  distributions: 

1  -  Uniform 

2  -  Triangular 

3  -  Normal 

4  -  Exponential 

5  -  Gamma 

6  -  Beta 

7  -  Discrete  distributions  or  customer  specified  distributions,  where 

these  two  cases  are  represented  by  a  set  of  finite  ordered  pairs. 

Each  of  the  distributions  is  identified  to  SIMULT  by  its  number  in  the 
above  listing,  denoted  by  I.  Hence  I  =  1  if  the  activity  has  a  uniform 
distribution,  and  I  =  2  if  the  distribution  is  triangular,  and  so  on. 

The  distribution  identity  I  associated  with  activity  a  is  denoted  by 
the  symbol  NDST(a).  Each  distribution  is  characterized  by  the  following 
four  parameters  -  (where  the  use  of  each  parameter  is  explained  in  the 
discussion  of  each  distribution) : 


EX (I)  : 


STDX(t) : 


VMIN(I) : 


VMAX(I) : 


a  real  value  representing  the  mean  in  some  distributions, 
and  in  some  others  it  is  used  to  input  other  parameters 
(such  as  the  first  parameter  for  Gamma  and  Beta), 
real  value  representing  the  standard  deviation  in  some  dis¬ 
tributions  and  another  parameter  in  some  other  distributions. 

minimum  value  of  the  random  variable  X  ,  determined  by  the 

a 

criterion  used  in  DISCRT. 

as  VMIN(I)  except  it  represents  the  maximum  of  X  . 

a 


i 


The  "true"  distribution  function  represented  by  the  Monte  Carlo  sampling 
for  the  original  network. 


J 

REAL IZAT ION 

PRO  0  AS  IL  I  T  Y 

ACC.  PR03. 

1 

1 5. 9925 

.  1 OOOOOF-02 

. 1 OOOQOE-02 

2 

16.9665 

.3C0000F-02 

. 4C0000F-02 

3 

18.0322 

.300000E-02 

. 70000  OE—  0  2 

A 

18.  8939 

. 300000E-02 

.9599996-02 

5 

19.5900 

•  9  99 999 E—  02 

•  20000  OE -01 

6 

20.5030 

.  1530006-0 1 

•  3  50  0  0  OE—  0 1 

7 

21.3058 

. 22  0  OOOF -  0 1 

. 56999  9F -01 

8 

22.2076 

.280  0006-0  1 

•  fi  499996—  0 1 

9 

22*  9867 

. 399999E  —  0 1 

. 1 25000 

10 

23.9390 

•  5 59999 E—  0 1 

. 181000 

1  1 

24. 8600 

•559999E— 0 1 

.237000 

1  2 

25.6529 

• 799996F - 0 1 

.316999 

l  3 

26.5022 

. 849996E  — 0  1 

.401999 

1  4 

27.41 74 

.8199966-0 1 

•  4  6  399  8 

1  5 

28. 1 880 

.7999966-01 

.563998 

16 

29.1118 

. 8  199966-0  1 

. £45998 

l  7 

29.9693 

.7899966-0 l 

.724997 

1  8 

30. 8674 

. 6999986-01 

. 794997 

1  9 

31  .6662 

.5999996-0 1 

.654997 

20 

32.4584 

•  3 1 9999F  —  0 1 

.886997 

2  1 

33.4634 

.3499996-01 

.921997 

22 

34.2368 

.2600006-01 

.947997 

23 

35. 0414 

. 2200006-01 

.969997 

24 

36.1 395 

.9999996-02 

.979996 

25 

36. 921 3 

.7999996-02 

.987996 

26 

37.7353 

.5999996-02 

.  993996 

27 

38.2952 

.200  000E-02 

.995996 

28 

39.477 e 

.◦ 

.995996 

29 

40.7200 

.3000006-02 

.998996 

30 

42.8619 

. 1 000006—02 

.999996 

average 

duration  of  the 

project  =  28.04  and 

its  S.D.  =  4.005 

Table  6 

F( 10)  Generated  by  SIMULT 
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J 

RF Ai_  IZAT  ION 

PROBABILITY 

ACC.  PROS. 

1 

8.75000 

.1 3997  OE  — 1 6 

.1 39970E-16 

2 

9. 78504 

.954873F-1 7 

. 2  35457E - 1 6 

3 

10.6420 

•  359588E—  t  I 

. 3  5959  OE— l 1 

4 

12.6604 

•  3  4 1 903E— 07 

•  341 939£— 07 

5 

14.0789 

.  1 24399F-05 

. 1 278 1 8E— 05 

6 

15. 1314 

.1  10340E-04 

. 123122E-04 

7 

16.4698 

.  14C472F-03 

. 152785F-03 

8 

1 7.5058 

.  102961E-05 

.  1538 1 4E— 03 

9 

1 8. 2941 

,21 5  756E—  02 

.231 137E-02 

10 

20.4425 

.  196633F-01 

.2  19747E-01 

1  1 

21. 1891 

.0 

•  2  19  74  7E—  0 1 

12 

21.9082 

•  2848  786- 01 

.50462 5E-01 

1  3 

23. 1433 

.5  39  027E—  0 1 

.  I  04  36  5 

1  4 

24.4393 

.93294SE-01 

.  1  97660 

1  5 

25.7718 

. i I5e?4 

. 313554 

1  6 

27.0051 

.  1  18442 

.431995 

1  7 

28.4196 

.  155535 

.567530 

18 

29. 8350 

.  1  18626 

.706156 

19 

31 . 1364 

•  9334  0  OE—  0 l 

.799496 

20 

32.3999 

, 800499F— 0 1 

.879546 

2  1 

33.7556 

•5  554  4  4E -0 1 

.935090 

22 

35. 0372 

. 313673E-01 

.966458 

23 

36.2193 

•  1  8  5 1 6 1 E—  0 1 

.964974 

24 

37.4626 

•  7  7966  7 E  — 02 

.992770 

25 

38. 7723 

•  57  32  34F - 02 

.598503 

26 

40.2990 

.  108035E  — 02 

.999583 

27 

41.6743 

. 3659S5E-03 

.999949 

28 

43.0373 

•214648E-04 

.995970 

2  <3 

44.5454 

«  7226  03F— 05 

.999978 

30 

47.2500 

.  1  77S79E-37 

.999578 

erage 

duration  of  the 

project  =  28.46 

and  its  S.D.  =  3.808 

Table  7 


F(10)  Generated  by  APRXMT 
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To  obtain  a  random  value  x  corresponding  to  the  random  variable  X  with 
a  given  distribution  F(x) ,  we  obtain  a  random  number  y  -  U(0,l)  then 
solve  for  x  using  the  equation 

Fx(x)  =  F (X  £  x)  =  y. 

Hence 

x  =  F_1(y). 


This  logic  is  used  in  the  subroutines  developed  by  International  Mathemati¬ 
cal  and  Statistical  Libraries,  Inc.  (IMSL)  [6],  However,  in  most  cases 
IMSL  generates  the  random  number  for  the  standard  distribution,  such  as 
the  case  for  standard  normal.  Beta,...  .  Let  r  represent  the  r.n.  obtained 
by  IMSL  subroutines.  The  necessary  transformation  is  made  to  give  us  the 
desired  r.v.  x.  Table  8  has  a  listing  of  the  distributions  and  the  necessary 
input  data  represented  by  the  first  five  columns  of  the  table.  Also  Table  8 
illustrates  the  use  of  the  above  four  parameters.  The  following  is  a  brief 
description  of  each  distribution: 


1  -  Uniform  Distribution: 


f(x) 


if  u  £  x  £  v 
otherwise 


r:  is  uniformly  distributed  between  0  and  1. 

x  =  u  +  r(v-u) 


V  =  (u+v) / 2 
o2  =  (v-u)2/12 
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2  -  Triangular  Distribution:  Let  k^  =  (v-u) (m-u) 


and  k^  =  (v-u) (v-m) 


2(x-u)/k^  if  u  ^  x  _<  m 


f  (x)  =  s  2(v-x)/k2  if  m  ^  x  _<  v 


otherwise 


r:  is  between  0  and  1,  and  has  a  uniform  distri¬ 


bution 


u  +  /  rk,  if  0  <  r  <  y  where  0  <  y  = 

1  —  —  —  v-u 


v  -  /(l-r)k„  if  y  _<  r  £  1 


U  *  (u-Hn+v)./3 


a  »  [u(u-m)  +  v(v-u)  +  m(m-u)]/18 


3  -  Normal  Distribution: 


*  1  -(x-u)2/2o2  . 

f  (x)  =  -  e  for  -°°  <  x  < 


and  o  >  0 


r:  is  a  random  number  from  a  standard  normal  distribution 


x  =  u  +  or 


4  -  Exponential  Distribution: 


—  exp(-  x/u)  for  x  >  0  and  a  >  0 
a 


1 


otherwise 


r:  is  a  random  number  from  an  exponential  distribution 


7  -  Discrete  Distributions:  This  category  includes  all  discrete  dis¬ 
tributions  and  any  customer  oriented  distribution.  Its  input  consists  of 


a  finite  set  of  ordered  pairs,  { (R  ,  p(R  ))}.  To  generate  a  random  number 

m  m 

from  any  of  these  distributions,  a  random  number  r  -  U(0,1)  is  generated, 
then  the  desired  random  realization  is  given  by 

x  =  P_1(r) . 


Distribution 

Index 

I 

EX  (I) 

STDX(I) 

VMIN (I) 

VMAX(I) 

1MSL 

Subroutine 

Uniform 

1 

0.00 

0.00 

u 

V 

GGUBS 

Triangular 

2 

m 

0.00 

u 

V 

GGUBS 

Normal* 

3 

V 

a 

u 

V 

GGNML 

Exponential* 

4 

a 

0.00 

u 

V 

GGEXN 

Gamma* 

5 

a 

6 

u 

V 

GGAMR 

Beta 

6 

a 

6 

u 

V 

GGBTR 

Discrete 

7 

not  applicable 

GGUBS 

Table  8 

Input  Parameters  for  SIMULT 

- in- -the -Normal,  Exponential- and- Gamma— distributions,  marked  by  *  in 

Table  8,  a  random  value-  is  accepted  only  if  it  is  in  the  interval  [u,v]; 
otherwise,  it  is  rejected  and  a  new  random  value  is  generated. 

2  -  Comparing  the  Approximate  pdf  with  the  "true"  pdf:  The  "true" 
pdf  of  the  project  completion  time  is  represented  by  the  pdf  obtained  from 
the  extensive  Monte  Carlo  sampling  denoted  by  F' (N) ;  the  cardinality  of 


F'(N)  is  NIN.  This  distribution  is  compared  with  the  approximate  pdf, 
denoted  by  F(N)  which  has  NRR  realizations.  The  objective  of  the  com¬ 
parison  is  to  determine  the  maximum  absolute  deviation  (MDV) ,  the  average 
value  of  the  absolute  deviations  (ADV)  between  the  two  distributions,  and 
the  mean  and  the  standard  deviation  of  each  distribution.  The  deviations 
are  computed  using  linear  interpolation.  Figure  10  illustrates  the  con¬ 
cept  of  linear  interpolation  where  the  points  generating  the  solid  line 
represent  the  approximate  pdf,  F(N),  and  the  scattered  points  in  the 
plane  (R,  P(R))  represent  the  "true"  pdf,  F'(N). 
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If  the  number  of  realizations  in  F*  (N)  is  n  =  NIN,  then  we  have  n  devia¬ 


tions.  Let 


Hence , 


D^:  be  the  k  deviation  where  k  =  1,2,..., n. 


MDV  =  Max{|D  | } 
k  * 


ADV  =  l  |D  J/n. 


In  the  approximate  pdf  the  minimum  and  maximum  realization  times  of 
the  project  are  always  preserved;  they  are  represented  by  R^  and  Rj^ 


respectively.  Hence 


RUR1 


and 

Therefore,  for  any  R^  of  the  true  realizations,  k  =  l,2,...,n,  there 
exists  an  approximate  realization  R^  where  j  <  NRR  such  that 

Rj  -  K  -  Rj « 

Using  this  relation  we  can  obtain  the  equation  of  the  line  segment  con¬ 
necting  the  two  points  (Rj,  P(Rj))  and  (Rj+^,  P(Rj+^))«  If  such  a  line 


is  denoted  by 


y  =  rx  +  s 


where  x  is  the  realization  axis  and  y  is  the  probability  axis,  then  the 


slope  of  the  line  is 


r  -  (P(R  -  P(Rj))/(RJ+1  -  Rj) 


and  the  intercept  is 


s  -  P(R  )  -  rR  . 
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Hence,  if  is  given, then  using  the  line  equation  we  calculate  the 
corresponding  approximate  probability  P(R^)  where 

P(R^)  =  y  =  rR^  +  s, 


Dk  =  P(Ri)  “  P'(Rk)' 


See  Figure  10  for  illustration. 

The  linear  interpolation  is  carried  out  for  all  realizations  except 
for  those  which  lie  in  the  0.01  left  and  right  tails  of  the  sampled  dis¬ 
tribution,  The  exclusion  of  the  two  tails  does  not  alter  the  values  of 


MDV  and  may  alter  ADV  only  slightly,  and  speeds  up  the  interpolation  since 
many  realizations  with  negligible  probabilities  might  lie  in  the  tails. 

For  example,  the  application  of  the  linear  interpolation  to  Tables  6  and 
7  led  to  the  exclusion  of  the  first  four  and  last  five  realizations  of 
Table  6.  Table  8  has  the  complete  output  of  the  linear  interpolation;  the 
third  column  in  the  table  headed  by  "APRXMTD  PROB."  represents  P(R^)  and 
the  last  column  represents  D^.  Figure  11  is  a  digital  plot  of  the  first 
three  columns  of  Table  8.  The  symbol  (•)  represents  the  sampled  distribute 
where  (-)  represents  the  approximate  distribut ion  and  (x)  is  used  whenever 
(.)  and  (-)  are  to  be  printed  in  the  same  location  in  the  xy-plane. 


I 

REALIZATION 

SAMPLED  PROR 

1 

1  9.6 

.200  00  E— 0 

2 

20.5 

. 35000F- 0 

3 

21.3 

.570006-0 

4 

22.  2 

•  8  50  0  CE -  0 

5 

23.0 

.  12500 

6 

2  3.9 

.18100 

7 

24.9 

.23700 

8 

25.7 

.31700 

9 

26.5 

.40200 

1  0 

27.4 

. 46400 

I  1 

28.2 

.56400 

1  2 

29.1 

.64600 

1  3 

30.0 

. 72500 

1  4 

30.9 

. 79500 

1  5 

31.7 

.85500 

16 

32.5 

. 88700 

1  7 

33.5 

.92200 

18 

34.2 

. 94800 

19 

35.0 

.57000 

20 

36.1 

.53000 

2  I 

36.9 

. 58800 

APRXMTD  PROS.  ACTUAL  DIFFERENCE 


.141 73E-01 

-.58274E-02 

•  2  3 1 5  IE—  C 1 

-.  1  1 B49E-01 

.38755E-01 

-. 18245E-01 

.6353  IF -01 

-.2 1469E-01 

•  9  7532E—  0  1 

—  •  27468E—  0 1 

.16163 

-. I9373E-01 

. 23422 

—  . 2  7838E  — 02 

.30321 

—  • 1 379  3E—  0 1 

.38369 

-. 1 8304E-01 

. 4  773  3 

- • 66729E- 02 

.56206 

-.193956-02 

.64554 

-.45729E— 03 

.71579 

-.92102E-02 

.78020 

-.1 4793F-0 1 

.33306 

—  .  2 19  39E—  0 1 

.881 94 

—  •  50  56 4 E—  0  2 

.92312 

. 1  1  195E-02 

.94687 

-. 1  1 306E—  02 

.96652 

-.34733E-02 

.58372 

. 37264E-02 

.98938 

.  13794E-02 

The  Average  of  the  Absolute  Values  of  the  Deviations  = 
The  Maximum  of  the  Absolute  Values  of  the  Deviations  = 
Number  of  Positive  Deviates  =  3 

Number  of  Negative  Deviates  =  18 


.  70003E-02 

.  27468E-01 • It  is  No.  5 


Table  8 


Comparison  of  the  Sampled  and  the  Approximate 
Probability  Distribution  Functions 


xiirwtc  Probability  Distribution  Functi 
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VII.  COMPUTATIONAL  EXPERIENCE  AND  CONCLUSIONS 


It  is  very  difficult  to  judge  the  accuracy  of  the  approximate  pdf  of 
the  project  completion  time  since  the  exact  pdf  is  not  known,  (it  may  be 
known  for  small  ANs),  and  the  literature  does  not  report  other  approximat¬ 
ing  procedures  beside  the  various  forms  of  MCS.  Therefore,  as  it  was  clear 
from  the  previous  section,  we  had  to  compare  the  approximate  pdf  with  that 
obtained  by  MCS  using  the  following  four  measures  of  performance :- 

1  -  Average  value  of  the  distribution 

2  -  Standard  deviation 

3  -  The  maximum  of  the  absolute  values  of  the  deviations  (MDV) 

4  -  The  average  of  the  absolute  values  of  the  deviations  (ADV) . 

The  variations  in  the  measures  of  performance  depend  on  the  structure  and 
size  of  the  AN,  the  distributions  of  the  activity  times,  the  sample  size 
in  MCS,  the  accuracy  of  the  discretization,  and  the  values  of  NRR  and  NIN. 
In  this  section  we  discuss  the  impact  of  these  factors  on  the  measures  of 
performance  (MOP)  and  conclude  the  section  with  some  conclusions  concern¬ 
ing  the  approximating  procedure. 

Table  9  shows  how  the  distribution  type  affects  the  MDP  for  a  randomly 
generated  AN  with  N  =  10  and  { A |  =  15  where  the  sample  size  is  set  equal  to 

1000.  The  parameters  of  the  distributions  used  are  given  in  Table  10.  In 

each  of  the  eight  problems  considered  in  Table  9,  the  approximate  average 
value  of  the  project  completion  time  is  within  1%  of  the  sampled  average. 
The  approximate  average  is  slightly  higher  than  the  sampled  average,  while 

the  approximate  standard  deviation  is  less  than  the  sampled  stan  ard  devia¬ 


tion.  The  graph  of  the  density  functions  of  each  problem  has  the  form 


*For  the  specification  of  each  distribution  see  Table  10  below. 

Table  9 

Sensitivity  of  the  Approximation 
Procedure  to  the  PDF’s 


Dist.  Type 

EX 

STDX 

VMIN 

VMAX 

! 

{(2.0,0.20), (3.0,0.30) ,(4.0,0.30), (5. 0,0. 20) } 


Table  10 


Probability  Distribution  Functions 
Used  in  the  Analysis 
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shown  in  Figure  12.  The  sampled  graph  converges  to  the  approximate  graph 
as  the  sample  size  increases;  Table  5  of  Section  VI  gives  an  example  of 
such  convergence.  The  values  of  Mi)V  and  ADV  vary  with  the  time  distribu¬ 
tions  adopted;  but  in  the  eight  problems  of  Table  9  MDV  is  always  less  than 
0.08  and  ADV  is  less  than  0.03.  The  smallest  values  for  MDV  and  ADV  are 
obtained  in  problem  7  where  a  discrete  distribution  was  used,  the  second 
smallest  values  of  MDV  and  ADV  were  obtained  in  problem  8  where  some  of 
the  activities  in  problem  8  have  discrete  distribution.  This  is  expected 
since  the  errors  of  discretization  in  problem  7  do  not  exist  and  in  problem 
8  they  are  less  than  in  the  remaining  problems.  The  accuracy  of  the  approx¬ 
imation  can  be  enhanced  by  having  more  accurate  discretization;  this  was 
the  case  in  problem  4  where  the  exponential  distribution  was  approximated 
by  thirty  points,  while  each  of  the  other  continuous  distributions  was 
approximated  by  only  twenty  points. 

In  Table  11  the  uniform  distribution  and  a  sample  of  size  1000  are  used 
to  examine  the  effects  of  the  AN  size  on  the  MOP.  Both  the  MDV  and  ADV 
increase  as  the  AN  size  increases;  perhaps  such  an  increase  is  due  to  retaining 
the  sample  size  constant  since  large  ANs  require  larger  sample  sizes.  The  graphs 
of  the  distributions  of  the  problems  in  Tables  9  and  1]  have  the  general  form 
of  Figures  12  and  13;  which  agree  with  the  general  forms  obtained  by  Van 
Slyke  [8]  in  sampling  different  PERT  networks.  The  measure  of  MDV,  in  most 
of  the  problems  considered,  has  its  value  from  within  the  values  of  the 
first  30%  of  the  distribution.  The  parameter  used  in  Section  VI. 2 
where 

Dk  =  P(R^)  -  P'(R^) 


-3a  -2 


Problem 


Nodes  Arcs 
(N)  (A) 


Comparison  of  the  Approximated  PDF  with  that  of  MCS 


Average 


APRX. 


27.27 

47.37 
52.30 
58.98 
56.06 
62.54 

67.38 
82.82 


27.20 

46.47 

51.89 

57.71 

55.14 

62.58 

65.56 

80.05 


S.  Deviation 


6.962 


0.0426 

0.0115 

0.0557 

0.0162 

0.0525 

0.0169 

0.0633 

0.01816 

0.0303 

0.01147 

0.0651 

0.0259 

0.0880 

0.0263 

0.1082 

0.0306 

Table  11 

Computational  experience  for  Different 
Size  AN's  with  Uniform  Distribution 
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General  Form  of  the  Sampled  and 
the  Approximate  Distributions 


Deviation 


The  Behavior  of  the  Deviation  D, 

k 


tends  to  have  negative  values  in  the  first  half  of  the  distribution  and 
positive  values  in  the  second  half.  As  the  errors  in  the  discretization 
decrease  and  the  sample  size  in  MCS  increases  the  curve  of  D^,  having  the 
general  form  of  Figure  14,  converges  toward  the  abscissa. 

The  CPU  time  requirements  of  the  approximating  procedure  excluding  the 
MCS  time  are  minimal.  It  is  always  less  than  half  a  minute  for  an  AN  of 
size  (N,A)  <  (60,200)  with  Uniform  distribution  on  AMDAHL  V-7.  The  CPU 
time  requirements  for  MCS  with  a  fixed  sample  size  depend  on  the  size  of 
the  AN  and  on  the  type  of  pdf's  used.  For  a  sample  of  size  1000  for  the 
problem  G(60,150)  with  a  Uniform  distribution  the  CPU  time  was  about  two 
minutes;  the  CPU  time  may  double  or  triple  if  other  distributions,  such 
as  the  Normal  or  Beta,  are  used. 

From  the  preceeding  discussion  we  conclude  the  following 

1  -  The  approximation  is  at  its  best  if  the  activities  have  discrete 
distributions  to  start  with.  The  accuracy  of  the  approximation  can  be 
improved  by  reducing  the  errors  of  discretization. 

2  -  The  distribution  of  the  project  completion  time  approaches  normality 
regardless  of  the  type  of  the  distributions  used  at  the  outset.  This  was 
the  case  in  all  the  problems  tested.  The  approximated  mean  and  standard 
deviation  of  the  distribution  are  very  close  to  the  "true"  mean  and 
standard  deviation.  In  fact  it  is  bounded  from  below  by  the  best  known 
estimate,  which  is  developed  by  Elmaghraby  [2],  and  bounded  from  above  by 
the  true  mean. 

3  -  In  comparison  with  the  pdf  obtained  by  MCS,  the  sampled  distribution 
converges  toward  the  approximate  pdf  as  the  sample  size  increases. 


4  -  The  maximum  value  of  the  absolute  deviation,  MDV,  is  within  the 
first  30%  of  the  distribution;  this  observation  increases  the  applicability 
of  the  approximate  pdf  since  the  realizations  of  the  major  interest  are 
those  on  the  right  half  or  tail  of  the  pdf. 

5  -  The  processing  time  requirements  of  the  approximating  procedure 
excluding  the  sampling  time  are  minimal.  It  is  always  less  than  half  a 
minute  for  any  AN  of  size  (N,A)  5  (60,200)  on  AMDAHL  V-7. 
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APPENDIX  A 


RANDOM  NETWORK  GENERATOR 


The  nodes  in  G(N,A)  are  numbered  such  that  an  arc  always  leads  from 
a  small  number  to  a  larger  one,  and  there  is  only  one  start  and  one  end 
node  to  the  AN.  An  immediate  consequence  of  such  a  numbering  scheme  is 
that  the  adjacency  matrix  is  always  upper  triangular  with  zero  diagonal. 
A  typical  AN  and  adjacency  matrix  is  given  in  Figure  A.l  below  for  N  =  4 
and  | Aj  =5. 


12  3  4 

0  110 
0  1  1 

0  1 
0 


T.i!)  li-  A.l 

An  Activity  Network,  and 
Its  Adjacency  Matrix 


Another  consequence  is  that  for  a  given  N  and  j A |  several  feasible 
G(N,A)  may  be  generated.  Figure  A. 2  lists  the  other  (three)  alternative 


AN  types  for  N  =  4  and  | A [  =5 


Figure  A. 2 

The  Remaining  Feasible  ANs  with  N  =  4,  | A |  =5 

Consequently,  the  random  generation  of  a  G(N,A)  for  a  fixed  N  and  | A  j  implies 
that  the  resultant  network  types  should  have  equal  probabilities  of  occur¬ 
rence  .  In  general,  the  following  two  procedures  should  be  able  to  satisfy 
this  requirement.  The  first  method  denoted  by  the  "Deletion  Method”  starts 
from  the  completely  connected  AN;  i.e.,  the  adjacency  matrix  filled  with 
ones,  and  deletes  the  necessary  number  of  arcs  until  | A  j  arcs  are  left. 

This  is  done  by  substituting  zeroes  for  ones  until  j  A {  ones  are  left  in 
the  adjacency  matrix.  The  second  procedure,  denoted  by  the  "Addition  Method", 
starts  with  the  unordered  AN;  i.e.,  the  adjacency  matrix  filled  with  zeroes, 
and  generates  the  required  number  of  arcs  j  A | .  This  is  done  by  substituting 
| A |  ones  for  zeroes  in  the  adjacency  matrix.  In  the  following  we  discuss 


the  rationale  of  both  methods. 


Figure  A.  2 

Completely  Connected  AN 


The  Deletion  Method  now  reduces  to  the  random  deletion  of 

N(N-l) /2  -  I A ( 

ones  in  the  adjacency  matrix,  such  that 


n .  >  1 

1  — 

for  all 

i  *  N 

and  n^  =  0 

(A. 2) 

and 

mj  -  1 

for  all 

J  *  1 

and  =  0 

(A. 3) 

For  any  AN, 

the  above  conditions 

s  i  mp  1  v 

state  that  at 

least  one  arc  must 

leave  every 

node  except 

the  last 

and  at 

least  one  arc 

should  enter  every 

node  except  the  first. 
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The  Deletion  Method  should  generate  activity  networks  with  equal 
probabilities  for  the  different  feasible  network  types,  i.e.,  all  existing 
ones  in  the  adjacency  matrix  for  the  completely  connected  network  should 
receive  equal  deletion  probabilities  given  the  above-mentioned  consistency 
constraints.  This  can  be  achieved  by  numbering  all  the  ones  in  the  adjacency 
matrix  for  the  completely  connected  AN  from  left  to  right  and  consecutively 
in  the  rows,  as  illustrated  in  Figure  A. 4. 


Figure  A. 4 

Label  and  Probability  Assignment 


The  corresponding  numbers  are  then  assigned  to  equal  intervals  in  the  range 
of  a  uniformly  distributed  variable.  Then  drawing  a  random  number  yields 
ari  interval  which,  in  turn,  identifies  the  label  of  a  corresponding  arc. 

The  interval  corresponding  to  a  node  i*  has  a  length  equal  to  (N-i*)  times  the 
interval  length  of  a  label.  For  example  in  Figure  A. 4  the  interval  corres¬ 
ponding  to  node  i*  =  2  has  a  length  of 


(4-2) (1/6)  =  1/3. 


It  can  also  be  seen  from  Figure  A. 4  that  i*  =  2  is  preceded  by  3  intervals. 


each  is  of  length  1/6;  i.e.,  in  general,  node  i*  is  preceded  by  at  least 

l  (N-i)  =  (i*-l)N  -  i*(i*-l)/2  (A. 4) 

0<i<i* 

labeled  intervals. 

In  order  to  generate  an  i*,  let  Y  -  U(Q,1)  and  let 

X  =  Y 'N(N-l) / 2  (A. 5) 

where  N(N-l)/2  denotes  the  total  number  of  labels  (total  number  of  arcs  in  the 
AN).  Now  the  interval  relation  between  X  and  i*  implies  that  (see  Eq .  (A. 4)) 

X  _>  (i*-l)N  -  i*(i*-l)/2 

or  with  a  >  0  we  have 

i*2 / 2  -  (N+l/2) i*  +  (N+X-a)  =  0  , 

which  yields 

i*  =  (N+l/2)  ±  ( N+l/2)2  -  2 (N+X-a)  .  (A. 6) 

Since  i*  N  -  1  we  must  select  the  root.  Moreover,  since  a  >  0, 

Eq .  (A. 6)  reduces  to 

i*  <_  (N+l/2)  -  «y/(N+l /2) 2  -  2N  -  2X  , 

or  i*  (N+l/2)  -  -yj (N-l/2) 2  -  2X  . 

Substituting  from  Eq .  (A. 5)  yields 
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i*  <  (N+l/2)  -  ^N(N-i) (1-Y)  +  1/4 


Also  by  a  symmetrical  argument  that  considers  the  nodes  that  are  larger 
than  i*,  we  have 


X  <  i*N  -  i*(i*+l) /2 


we  find 


i*  >  (N-l/2)  -  /N(N-1)(1-Y)  +  1/4. 


But  Y  ~  U(0,1)  implies  that  (1-Y)  '  U(0,1);  hence  let  8  =  /N(N-1)Y  +  1/A, 


N-l/2  -  6  <  i*  <  N  +  1/2  -  8 


+  1/2 


-  sj 


(A.  7) 


Given  this  value  of  i,  we  draw  a  new  random  observation  of  Y  -  U(Q,1) 
and  rescale  into  X  -  U(i+1,  N+l)  by  setting 


X  =  Y(N-i*)  +  i*  +  1 


which  in  turn  yields 


J*  •  li*  +  1  *  Y  *  <*-‘*5  ■ 


(A. 8) 


The  corresponding  arc  (i*,  j*)  is  deleted  from  the  AN  provided  that  con¬ 
ditions  A. 2  and  A. 3  are  satisfied.  This  procedure  is  repeated  until 


l  \  ’  l 

i  j  J 


2  -  The  Addition  Method:  The  Deletion  Method  will  delete  N(N-l)/2  -  j A | 
arcs.  For  certain  values  of  N  and  | A [ ,  this  may  be  a  time  consuming  process. 
Suppose  we  have  to  generate  a  network  with  N  =  4  and  |a|  =  5,  then  the  dele¬ 
tion  method  will  have  to  delete  1  arc;  however,  if  N  =  100  and  A  =  150,  then 
4800  out  of  4950  arcs  need  to  be  deleted. 

Under  such  conditions,  the  Addition  Method  may  prove  to  be  less  time- 
consuming.  As  a  consequence  of  the  node  labeling  procedure  adopted,  there 
should  always  be  an  arc  connecting  nodes  1  and  2  and  an  arc  connecting  nodes 
N  -  1  and  N.  Consequently,  we  believe  the  Deletion  Method  is  to  be  preferred 
if 

|  At  >_  N(N-l)/4  +  1, 

and  we  prefer  the  Addition  Method  if  otherwise. 

Consider  now  the  previous  example  with  n  =  4  and  j  a]  =5.  Figure  A. 5 
represents  the  initial  adjacency  matrix  and  the  corresponding  network.  It 
can  be  observed  from  Figure  A. 5  that  node  2  is  not  yet  an  emitting  node  and 

1  2  3  4 

10  10  0 

2  0  0  0 

3  0  1 

4  0 

Figure  A. 5 

Initial  Network  in  the  Addition  Method 

node  3  is  not  yet  a  receiving  node.  In  general  the  initial  network  will  be 
characterized  by  n  =  N  -  3  non-emitting  nodes,  and  m  *  N  -  3  non-receiving 
nodes.  This  means  that 
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f  =  J  A  j  -  2  -  m  -  n 

of  the  remaining  arcs  may  be  inserted  arbitrarily;  in  our  example 

f  =  5  —  2—  1—  1=  1 

arc  of  the  three  remaining  may  be  generated  completely  arbitrarily  (i.e., 
both  of  its  terminal  nodes  are  arbitrary),  since  at  least  an  arc  must 
enter  node  3,  and  one  arc  must  leave  node  2.  Hence,  the  terminal  of  the 
arc  from  node  2  and  the  origin  of  the  arc  to  node  N-l  may  be  selected 
arbitrarily.  Consequently,  the  Addition  Method  will  start  from  the  initial 
network  and  adjacency  matrix  (all  a_  =  0  except  a^  =  1  and  aN_j  N  =  1). 

It  uses  formulas  (A. 7)  and  (A. 8)  to  generate  an  arc  as  long  as  the  residual 
free  arcs  f  is  >  0  where 


f  =  A-  £-  m-  n>0 

and  initially,  the  number  of  generated  arcs  l  =  2,  the  non-emitting  nodes 
n  =  N  -  3, and  the  number  of  non-receiving  nodes  m  =  N  -  3.  Each  time  an 
arc  is  generated  in  this  manner,  the  adjacency  matrix  is  updated,  the  value 
of  t  is  set  to 


1=1+1, 

and  depending  on  the  outcome  either 

m  =  m  -  1 

and/or 


n  =  n  -  1 . 


The  Addition  Method  developed  by  Herroelen  and  Caestecke  [5]  is 

represented  by  Flowchart  1.  If  t  <  A  and  f  =  0  we  check  if  m  =  0.  If 

m  i-  0,  indicating  that  there  is  at  least  one  non-receiving  node,  we  locate 
the  column  j*  in  the  adjacency  matrix  that  is  completely  filled  with  zeroes 
(if  ties  develop,  take  the  highest  column  index).  We  generate  a  correspond¬ 
ing  i*  using 

i*  =  JjL  +  (j*  -  1)yJ  where  Y  ~  U(0,1). 

Update  the  adjacency  matrix,  and  the  values  of  £,  m,  n  and  f  then  continue 

until  either  t  =  A  where  the  process  stops,  or  Z  <  A  and  f  =  m  =  0;  in 

such  a  case  we  check  if  n  =  0.  If  n  /  0,  then  there  is  at  least  one  non¬ 
emitting  node.  We  locate  any  zero  row,  i*  <  N  -  1,  in  the  adjacency  matrix, 

and  generate  a  j*  using  the  formula 

j*  =  |i*  +  1  +  (N  -  i*)Yj  where  Y  ~  U(0,1). 

We  continue  until  either  &  =  A  or  n  =  0,  where  in  either  case  the  process 

stops. 

Ideally  as  soon  as  m  =  0  and  n  =  0  we  should  have  t  =  |a|  .  However 
the  Addition  Method  presented  in  Flowchart  1  does  not  guarantee  this  result. 
If  j  A  J  <  2N  -  4  then  the  Addition  Method  may  fail  to  generate  a  feasible 
AN,  and  if  the  feasibility  conditions  are  imposed  then  the  method  may 
generate  more  arcs  than  what  is  required.  The  following  example  illustrates 
this  defficicncy: 

Example:  Let  N  =  10  and  | A ,  =  12  then  Table  1  below  summarizes  the 

steps  taken  by  the  Addition  Method  represented  by  Flowchart  1.  It  is 
obvious  that  at  step  10  we  have  £  =  j  A  j  =  12  and  according  to  Flowchart  1 
the  process  stops,  even  though  there  are  nodes  not  connected  from  above, 
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Flowchart  1 


The  Addition  Method 
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i.e.,  there  are  non-emitting  nodes.  If  at  step  10  we  let  the  process  to 
go  to  (2)  instead  of  ©  in  Flowchart  1,  then  the  process  will  terminate 
only  after  changing  the,  "go  to  2"  in  the  decision  "Is  n  =  0?"  to  stop. 
In  such  a  case  the  process  stops  after  n  reaches  zero,  where  the  number 
of  generated  arcs  can  exceed  12.  In  fact  in  this  example  the  Addition 
Method  can  generate  up  to  16  arcs.  For  the  realization  presented  in 
Table  1,  t  goes  up  to  15  arcs. 

This  deficiency  is  always  possible  for  all|Aje[N  -  1 ,  2N  -  5].  For 
|A |>  2N  -  4  the  Addition  Method  as  outlined  in  Flowchart  1  appears  to  be 
working.  In  section  II  we  modify  the  Addition  Method  to  avoid  this 
deficiency. 


Possible  Realization  of  Flowchart 
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APPENDIX  B 
THE  2m  METHOD 

It  was  mentioned  in  Section  III  that  the  first  method  of  discretizing 
a  continuous  pdf  is  to  use  the  first  2m  moments  of  the  continuous  distri¬ 
bution  to  solve  the  following  system  of  nonlinear  equations 
m 

£  x^p(x  )  =  E(xn)  =  e  for  n  =  0,1,2, ...  ,2m- 1 

k«  k  tc  n 

=  1 

In  a  matrix  form  we  have 


VP  =  E. 

The  following  two  methods  have  been  tried  to  solve  this  system  of  nonlinear 
equations,  but  neither  system  succeeded  in  solving  it  for  m  >  8.  These 
methods  are: 

1  -  Brown  Method:  which  is  documented  in  IMSL  [6]  library  under  the 
name  ZSYSTM.  Starting  with  an  initial  solution  ZSYSTM  is  supposed  to  con¬ 
verge  to  a  solution  within  e  from  a  feasible  solution.  However,  many  runs 
to  different  values  of  m  and  different  initial  solutions  proved  that  ZSYSTM 
was  not  converging,  and  often  terminated  because  of  a  singularity  that 
occurred  in  the  iterations,  due  mainly  to  the  nature  of  Vendermonde  matrix 
V.  Two  other  packages  SBROWN  and  SNGINT  developed  by  the  Argonne  National 
Laboratory  have  been  tried;  neither  succeeded  in  solving  the  above  system. 
This  failure  led  to  the  search  for  other  methods.  The  following  method 
was  successful  in  solving  the  above  system,  but  only  for  small  value  of 
m  (<  8)  . 


(1) 
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2  -  Gaussian  Quadrature  [4]:  To  solve  the  above  system  of  nonlinear 
equations  the  procedure  is  as  follows: 

(i)  Determine  the  sample  polynomial 


"(x)  =  l  c  x  , 


The  coefficients  {c^}  are  determined  uniquely  using  the  follow¬ 
ing  system  of  linear  equations  after  setting  c  ■  1. 


c  e  +  c.e.  +  c„e„  +  . .  .+  c  ,e  _  +  e  =  0 
00  11  2  2  m-1  m-1  m 


c  e  +  c  e  +  c.e  +...+  c  e  +  e  =  0 
o  1  12  23  m-1  m  m+1 


c  e  +  c  e  +  c.e  +. . .+  c  .e  ..  +  e  .  =  0 
o  2  13  24  m-1  m+1  m+2 


c  e  ,  +  c.e  +  c.e  +...+  c  .e.  +  e„  .  =  0 

o  m-1  1  m  2  m+1  m-1  2m-2  2m-l 

(ii)  The  set  of  realizations  (discrete  points)  are  determined  by  solving 
the  polynomial 


I  c,x  =0 


where  all  m  solutions  are  simple  and  real  (since  0  <  u  <  x  <  v) . 


(iii)  The  corresponding  probabilities  are  determined  by  substituting 
for  x^  in  the  first  m  equations  of  (1)  then  solve  uniquely  for 
P(*k> • 

This  algorithm  was  programmed  and  tested;  it  works  for  m  <  8.  It  is 
not  recommended  for  discretization  since  it  is  very  sensitive  to  the  values 


of  E(x  ),  and  requires  the  solution  of  two  systems  each  of  m  linear  equa- 


tions,  and  the  solution  of  a  polynomial  of  the  mC^  degree,  each  time  it 
is  used  to  approximate  a  distribution.  Furthermore,  the  user  would  never 
know  when  the  procedure  will  "blow-up". 
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APPENDIX  C 

7 

THE  INPUT  FORMAT 


The  approximating  procedure  has  been  programmed  and  for  the  ease  and 
flexibility  of  its  use,  the  program  is  in  two  parts:  the  first  is  used 
when  the  AN  is  given  and  the  second  is  used  if  the  AN  is  to  be  generated. 
This  section  describes  the  input  requirements  of  each  part. 


1  -  Input  for  an  available  AN:  The  input  data  is  listed  according  to 
the  following  order  and  format. 

(a)  Control  Card:  It  is  the  first  input  card;  it  contains  the 
control  parameters  listed  in  the  following  order  according 
to  the  format  (F5.3,  715); 

SCAL,N,M,NRR,NCONT,MCS,NSIM,KEY 

where 


SCAL 

N 


M 


NRR 

NCONT 


MCS 


A,  the  interval  width  (mesh)  used  in  DISCRT. 

Number  of  nodes 
j  Aj  ,  number  of  arcs 

Number  of  desired  ordered  pairs  in  the  approximated  distribution, 
i  0  if  the  AN  has  no  arcs  with  continuous  distribution 

i 

! 

(_  1  otherwise 

j  0  if  the  Monte  Carlo  sampling  is  not  desired 


1  otherwise 

NSIM  *  Number  of  samples  if  MCS  =  1 

•  Number  of  milestones  (key  nodes). 


KEY 
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(b)  List  of  the  KEY  nodes:  These  are  listed  in  a  non-decreasing 
order  according  to  the  format  (1615).  They  are  integers  de¬ 
noted  by  the  symbol  KEYN. 

(c)  Identity  of  the  Activities:  This  consists  of  four  integer 
values  listed  on  one  card,  according  to  the  format  415,  for 
each  activity.  These  values  are: 

NS, NE, NDSTT, NR 

where 

NS:  starting  node 
NE:  end  node 

NDSTT  =  1,2,. ..,7,  indicator  of  the  activity  pdf. 

NR:  number  of  ordered  pairs  of  distribution  if  NDSTT  ■  7. 

(d)  Distribution  Parameters:  Those  are  read,  in  the  order  of  the 
arcs  (which  is  the  topological  order  of  the  AN).  If 
NDSTT(a)  7,  i.e.,  the  arc  has  a  continuous  distribution, 
then  the  following  four  values  listed  according  to  the  format 
(4F10.4)  are  needed  for  each  activity.  These  are: 

EX , STDX , VMIN , VMAX 

which  are  explained  in  Table  8  of  Section  VI. 

If  NDSTT (a)  =  7  then  NR(a)  ordered  pairs,  {(R(k),  p(R(k)))}, 
are  listed  according  to  the  format  (4(F10.2,  F10.4)),  hence 
each  card  contains  four  ordered  pairs. 

NOTE:  The  (d)  part  of  the  input  assumes  each  activity  has  its  own  distribu¬ 
tion.  The  input  routine  can  be  changed  to  accommodate  the  assignment  of  a 
given  distribution  to  a  set  of  activities. 


2  -  Input  if  the  AN  is  to  be  Generated:  The  input  for  this  part  is 
limited  to  the  following  segments: 

(a)  Control  Card:  as  in  1(a)  above. 

(b)  List  of  the  KEY  nodes:  as  in  1(b)  above. 

(c)  Activity-Distribution  Assignment:  This  is  a  vector  of  length 
NOI,  where  NOI  is  the  number  of  intervals  (partitions)  of  the 
set  A,  where  each  partition  has  one  p.d.f.  Each  entry  in  this 
vector  consists  of 

NULT.NDS ,NT 

where 

NULT:  number  of  the  activity  representing  the  upper  limit 
of  the  partition 

NDS:  1,2,..., 7,  indicator  of  the  pdf  of  the  partition 
NT:  number  of  ordered  pairs  of  the  pdf  if  NDS  =  7 
This  vector  is  listed  according  to  the  format  (1615). 

(Notice  that  the  first  entry  at  the  beginning  of  the  vector 
is  the  value  of  NOI). 

(d)  Distribution  Parameters:  as  in  1(d)  above. 

At  the  end  of  both  parts  of  the  input  data  we  add  the  information  needed 
for  the  digital  plotter,  the  plotter  is  USPLT  of  the  IMSL  Library.  It  is 
listed  on  three  cards  as  follows: 

1  -  First  card, which  has  the  format  (4F10.2,  10A1,  415) , contains  the 
following: 

(a)  RAN(I)  for  I  =  1,2, 3, 4:  Four  values  specifying  the  minimum  and 
maximum  of  the  x  and  y  axis  respectively.  If  RAN(I)  are  set 
equal  to  zero,  then  the  program  determines  the  x  and  y  ranges. 
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(b)  PCH:  Contains  up  to  10  plot  characters,  one  for  each 

function.  If  not  specified  leave  PCH(l)  and  PCH(2) 
blank. 

(c)  IOP:  A  zero  one  input  parameter  indicating  number  of 

printer  columns  available. 

(d)  INC:  Displacement  between  values  in  x  to  be  considered. 

(e)  IY:  First  dimension  of  the  arry  y. 

(f)  NF:  Number  of  functions  to  be  plotted. 

2  -  Second  card  contains  72  characters  of  title  information. 

3  -  Third  card  contains  36  characters  for  each  axis  for  its  annotation. 


